Introduction to PyTorch: Why Tensors Matter
PyTorch is a flexible, dynamic open-source framework favored for deep learning research and rapid prototyping. At its core, the Tensor is the indispensable data structure: a multi-dimensional array designed to efficiently handle the numerical operations required by deep learning models, with built-in support for GPU acceleration.
1. Understanding Tensor Structure
Every input, output, and model parameter in PyTorch is encapsulated in a Tensor. They serve the same purpose as NumPy arrays but are optimized for processing on specialized hardware like GPUs, making them far more efficient for the large-scale linear algebra operations required by neural networks.
Key properties define the tensor:
- Shape: Defines the dimensions of the data, expressed as a tuple (e.g., $4 \times 32 \times 32$ for a batch of images).
- Dtype: Specifies the numeric type of elements stored (e.g., torch.float32 for model weights, torch.int64 for indexing).
- Device: Indicates the physical hardware location: typically 'cpu' or 'cuda' (NVIDIA GPU).
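As a quick sketch (the tensor name and values here are illustrative, not from the lesson), all three properties can be inspected directly on any tensor:

```python
import torch

# A batch-like tensor of random values, e.g. 4 images of 32x32
x = torch.rand(4, 32, 32)

print(x.shape)   # torch.Size([4, 32, 32])
print(x.dtype)   # torch.float32 (the default floating-point dtype)
print(x.device)  # cpu (tensors live on the CPU unless explicitly moved)
```

Moving a tensor to the GPU is explicit, e.g. `x.to('cuda')`, and only succeeds if CUDA is available.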
2. Dynamic Graph and Autograd
PyTorch uses an imperative execution model, meaning the computational graph is built as operations are executed. This enables the built-in automatic differentiation engine, Autograd, to track every operation on a Tensor, provided requires_grad=True is set on it, allowing gradients to be computed easily during backpropagation.
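A minimal sketch of this mechanism (the variable names are illustrative): setting requires_grad=True makes Autograd record each operation, and calling backward() computes the gradients.

```python
import torch

# Track operations on w so Autograd can compute gradients
w = torch.tensor([2.0, 3.0], requires_grad=True)

# y = w0^2 + w1^2; each operation is recorded in the dynamic graph
y = (w ** 2).sum()

# Backpropagation: dy/dw = 2 * w
y.backward()
print(w.grad)  # tensor([4., 6.])
```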
Question 1
Which command creates a $5 \times 5$ tensor containing random numbers following a uniform distribution between 0 and 1?
Question 2
If tensor $A$ is on the CPU, and tensor $B$ is on the CUDA device, what happens if you try to compute $A + B$?
Question 3
What is the most common data type (dtype) used for model weights and intermediate calculations in Deep Learning?
Challenge: Tensor Manipulation and Shape
Prepare a tensor for a specific matrix operation.
You have a feature vector $F$ of shape $(10,)$. You need to multiply it by a weight matrix $W$ of shape $(10, 5)$. For matrix multiplication (MatMul) to work, $F$ must be 2-dimensional.
Step 1
What should the shape of $F$ be before multiplication with $W$?
Solution:
The inner dimensions must match, so $F$ must be $(1, 10)$. Then $(1, 10) @ (10, 5) \rightarrow (1, 5)$.
Code:
F_new = F.unsqueeze(0) or F_new = F.view(1, -1)
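The step above can be sketched as a runnable snippet (using a random $F$ for illustration), showing that both unsqueeze and view produce the required $(1, 10)$ shape:

```python
import torch

F = torch.rand(10)       # feature vector, shape (10,)
F_new = F.unsqueeze(0)   # insert a leading dimension -> shape (1, 10)
print(F_new.shape)       # torch.Size([1, 10])

# Equivalent reshape: -1 tells PyTorch to infer the remaining dimension
assert F.view(1, -1).shape == F_new.shape
```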
Step 2
Perform the matrix multiplication between $F_{new}$ and $W$ (shape $(10, 5)$).
Solution:
The operation is straightforward MatMul.
Code:
output = F_new @ W or output = torch.matmul(F_new, W)
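As a sketch with random placeholder values, the multiplication and the resulting $(1, 5)$ shape look like this:

```python
import torch

F_new = torch.rand(1, 10)  # reshaped feature vector
W = torch.rand(10, 5)      # weight matrix

output = F_new @ W         # inner dims match: (1, 10) @ (10, 5) -> (1, 5)
print(output.shape)        # torch.Size([1, 5])

# torch.matmul is the functional equivalent of the @ operator
assert torch.equal(output, torch.matmul(F_new, W))
```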
Step 3
Which method explicitly returns a tensor with the specified dimensions, allowing you to flatten the tensor back to $(50,)$? (Assume $F$ was $(5, 10)$ initially and now needs to be flattened.)
Solution:
Use the view or reshape methods. The fastest way to flatten is often to pass -1 for one dimension.
Code:
F_flat = F.view(-1) or F_flat = F.reshape(50)
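A runnable sketch of the flattening step (again with a random placeholder $F$), showing that -1 lets PyTorch infer the flattened size:

```python
import torch

F = torch.rand(5, 10)   # original 2-D tensor

F_flat = F.view(-1)     # -1 infers the size: 5 * 10 = 50
print(F_flat.shape)     # torch.Size([50])

# reshape(50) is equivalent here; unlike view, reshape also
# handles non-contiguous tensors by copying when necessary
assert torch.equal(F_flat, F.reshape(50))
```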